Learning Apparatus, Learning Method, Information Processing Apparatus, Data Selection Method, Data Accumulation Method, Data Conversion Method and Program

ABSTRACT

There is provided a learning apparatus including: a first data acquisition unit which acquires first user preference data belonging to a first data space; a second data acquisition unit which acquires second user preference data of a user in common with the first user preference data, the second user preference data belonging to a second data space which is different from the first data space; a compression unit which generates first compressed user preference data having a smaller number of data items from the first user preference data by utilizing a first set of parameters; and a learning unit which learns a second set of parameters utilized for generating, from the second user preference data, second compressed user preference data having the same number of data items as that of the first compressed user preference data, so that a difference between the first compressed user preference data and the second compressed user preference data is small across a plurality of users.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a learning apparatus, a learning method, an information processing apparatus, a data selection method, a data accumulation method, a data conversion method and a program.

2. Description of the Related Art

Recently, a variety of contents such as music, video, books and news articles have been provided to users via networks such as the internet with the progress of information technology. Since enormous amounts of contents are managed in such content providing services, users have difficulty finding by themselves the contents that suit them. Accordingly, a technology called recommendation has been utilized to acquire a user's preference based on the user's action history, such as purchasing or viewing, and to select and propose contents suitable for individual users.

One point common to many of the existing recommendation technologies is that they utilize user preference data (UP), which indicates a user's preference by numerical values or the like derived from the user's action history. For example, in the recommendation algorithm called collaborative filtering, the user preference data are compared among different users so as to identify a user having similar preferences, and a content used by that user in the past becomes the subject for recommendation. Examples of this recommendation algorithm are disclosed in Japanese Patent Application Laid-Open Nos. 2006-215867 and 2008-077386. Meanwhile, in the recommendation algorithm called content-based filtering, user preference data and content property data indicating content properties in a common data space are compared, and contents which are determined to be suitable for the user's preference become the subject for recommendation.

SUMMARY OF THE INVENTION

However, there are cases where the data spaces of the user preference data or the content property data differ from one another due to differences in, for example, the content domains subject to recommendation, the device types that generate the user preference data, or the vendors supplying those devices. When the data spaces are different, the range of contents that can be recommended based on the user preference data or the content property data is restricted. In addition, there are cases where appropriate contents are difficult to recommend because of insufficient accumulation of the action history, for example.

In light of the foregoing, it is desirable to provide a novel and improved learning apparatus, learning method, information processing apparatus, data selection method, data accumulation method, data conversion method and program which are capable of commonly managing user preferences or content properties among domains of different data spaces.

According to an embodiment of the present invention, there is provided a learning apparatus including: a first data acquisition unit which acquires first user preference data belonging to a first data space; a second data acquisition unit which acquires second user preference data of a user in common with the first user preference data, the second user preference data belonging to a second data space which is different from the first data space; a compression unit which generates first compressed user preference data having a smaller number of data items from the first user preference data by utilizing a first set of parameters; and a learning unit which learns a second set of parameters utilized for generating, from the second user preference data, second compressed user preference data having the same number of data items as that of the first compressed user preference data, so that a difference between the first compressed user preference data and the second compressed user preference data is small across a plurality of users.

The learning unit may learn the second set of parameters with the first compressed user preference data generated by the compression unit as training data for the second compressed user preference data.

The compression unit may generate the first compressed user preference data in accordance with a multi-topic model.

The first set of parameters and the second set of parameters may be sets of parameters corresponding to the topic intrinsic distributions of the multi-topic model.

The first data space and the second data space may be data spaces corresponding to content domains which are different from each other.

The first data space and the second data space may be data spaces of user preference data generated by devices which are different from each other.

According to another embodiment of the present invention, there is provided a learning method including the steps of: acquiring first user preference data belonging to a first data space; acquiring second user preference data of a user in common with the first user preference data, the second user preference data belonging to a second data space which is different from the first data space; generating first compressed user preference data having a smaller number of data items from the first user preference data by utilizing a first set of parameters; and learning a second set of parameters utilized for generating, from the second user preference data, second compressed user preference data having the same number of data items as that of the first compressed user preference data, so that a difference between the first compressed user preference data and the second compressed user preference data is small across a plurality of users.

According to another embodiment of the present invention, there is provided a program for causing a computer which controls an information processing apparatus to perform functions including: a first data acquisition unit which acquires first user preference data belonging to a first data space; a second data acquisition unit which acquires second user preference data of a user in common with the first user preference data, the second user preference data belonging to a second data space which is different from the first data space; a compression unit which generates first compressed user preference data having a smaller number of data items from the first user preference data by utilizing a first set of parameters; and a learning unit which learns a second set of parameters utilized for generating, from the second user preference data, second compressed user preference data having the same number of data items as that of the first compressed user preference data, so that a difference between the first compressed user preference data and the second compressed user preference data is small across a plurality of users.

According to another embodiment of the present invention, there is provided an information processing apparatus including: a data acquisition unit which acquires first user preference data belonging to a first data space; a compression unit which generates first compressed user preference data having a smaller number of data items from the first user preference data by utilizing a first set of parameters; a storage unit which stores a plurality of data having the same number of data items as that of the first compressed user preference data, the plurality of data being generated, by utilizing a second set of parameters, from second user preference data or content property data belonging to a second data space which is different from the first data space; and a selection unit which selects at least one piece of data from the plurality of data stored in the storage unit according to the similarity degree to the first compressed user preference data generated by the compression unit, wherein the plurality of data stored in the storage unit are data previously generated by utilizing the second set of parameters, which is learned so that a difference between the first compressed user preference data and second compressed user preference data generated from the second user preference data of a common user is small across a plurality of users.

According to another embodiment of the present invention, there is provided a data selection method including the steps of: acquiring first user preference data belonging to a first data space; generating first compressed user preference data having a smaller number of data items from the first user preference data by utilizing a first set of parameters; and selecting at least one piece of data, according to the similarity degree to the first compressed user preference data, from a plurality of data having the same number of data items as that of the first compressed user preference data, the plurality of data being generated, by utilizing a second set of parameters, from second user preference data or content property data belonging to a second data space which is different from the first data space, wherein the plurality of data are data previously generated by utilizing the second set of parameters, which is learned so that a difference between the first compressed user preference data and second compressed user preference data generated from the second user preference data of a common user is small across a plurality of users.

According to another embodiment of the present invention, there is provided a program for causing a computer which controls an information processing apparatus to perform functions including: a data acquisition unit which acquires first user preference data belonging to a first data space; a compression unit which generates first compressed user preference data having a smaller number of data items from the first user preference data by utilizing a first set of parameters; a storage unit which stores a plurality of data having the same number of data items as that of the first compressed user preference data, the plurality of data being generated, by utilizing a second set of parameters, from second user preference data or content property data belonging to a second data space which is different from the first data space; and a selection unit which selects at least one piece of data from the plurality of data stored in the storage unit according to the similarity degree to the first compressed user preference data generated by the compression unit, wherein the plurality of data stored in the storage unit are data previously generated by utilizing the second set of parameters, which is learned so that a difference between the first compressed user preference data and second compressed user preference data generated from the second user preference data of a common user is small across a plurality of users.

According to another embodiment of the present invention, there is provided an information processing apparatus including: a first data acquisition unit which acquires first user preference data belonging to a first data space; a second data acquisition unit which acquires second user preference data belonging to a second data space which is different from the first data space; a first compression unit which generates first compressed user preference data having a smaller number of data items from the first user preference data by utilizing a first set of parameters and stores the first compressed user preference data on a recording medium; and a second compression unit which generates, from the second user preference data, second compressed user preference data having the same number of data items as that of the first compressed user preference data by utilizing a second set of parameters and stores the second compressed user preference data on a recording medium, wherein the first set of parameters or the second set of parameters is a set of parameters which is learned so that a difference between the first compressed user preference data and the second compressed user preference data of a common user is small across a plurality of users.

According to another embodiment of the present invention, there is provided a data accumulation method including the steps of: acquiring first user preference data belonging to a first data space; acquiring second user preference data belonging to a second data space which is different from the first data space; generating first compressed user preference data having a smaller number of data items from the first user preference data by utilizing a first set of parameters and storing the first compressed user preference data on a recording medium; and generating, from the second user preference data, second compressed user preference data having the same number of data items as that of the first compressed user preference data by utilizing a second set of parameters and storing the second compressed user preference data on a recording medium, wherein the first set of parameters or the second set of parameters is a set of parameters which is learned so that a difference between the first compressed user preference data and the second compressed user preference data of a common user is small across a plurality of users.

According to another embodiment of the present invention, there is provided a program for causing a computer which controls an information processing apparatus to perform functions including: a first data acquisition unit which acquires first user preference data belonging to a first data space; a second data acquisition unit which acquires second user preference data belonging to a second data space which is different from the first data space; a first compression unit which generates first compressed user preference data having a smaller number of data items from the first user preference data by utilizing a first set of parameters and stores the first compressed user preference data on a recording medium; and a second compression unit which generates, from the second user preference data, second compressed user preference data having the same number of data items as that of the first compressed user preference data by utilizing a second set of parameters and stores the second compressed user preference data on a recording medium, wherein the first set of parameters or the second set of parameters is a set of parameters which is learned so that a difference between the first compressed user preference data and the second compressed user preference data of a common user is small across a plurality of users.

According to another embodiment of the present invention, there is provided an information processing apparatus including: a storage unit which stores a first set of parameters for generating first compressed user preference data having a smaller number of data items from first user preference data belonging to a first data space, and a second set of parameters for generating, from second user preference data belonging to a second data space which is different from the first data space, second compressed user preference data having the same number of data items as that of the first compressed user preference data; and a conversion unit which converts the first user preference data into the second user preference data based on the first set of parameters and the second set of parameters stored in the storage unit, wherein the first set of parameters or the second set of parameters is a set of parameters which is learned so that a difference between the first compressed user preference data and the second compressed user preference data of a common user is small across a plurality of users.

The conversion unit may convert the first user preference data into the second user preference data in accordance with a correspondence between data items of the first user preference data and data items of the second user preference data, the correspondence being determined according to the similarity degree of the parameter values of each data item between the first set of parameters and the second set of parameters.

The information processing apparatus may further include a compression unit which generates the first compressed user preference data from the first user preference data by utilizing the first set of parameters, and the conversion unit may determine likely second user preference data, that is, data capable of generating, by utilizing the second set of parameters, the first compressed user preference data generated by the compression unit, as the second user preference data converted from the first user preference data.

According to another embodiment of the present invention, there is provided a data conversion method including a step of converting first user preference data into second user preference data based on a first set of parameters for generating first compressed user preference data having a smaller number of data items from the first user preference data belonging to a first data space, and a second set of parameters for generating, from the second user preference data belonging to a second data space which is different from the first data space, second compressed user preference data having the same number of data items as that of the first compressed user preference data, wherein the first set of parameters or the second set of parameters is a set of parameters which is learned so that a difference between the first compressed user preference data and the second compressed user preference data of a common user is small across a plurality of users.

According to another embodiment of the present invention, there is provided a program for causing a computer which controls an information processing apparatus to perform functions including: a storage unit which stores a first set of parameters for generating first compressed user preference data having a smaller number of data items from first user preference data belonging to a first data space, and a second set of parameters for generating, from second user preference data belonging to a second data space which is different from the first data space, second compressed user preference data having the same number of data items as that of the first compressed user preference data; and a conversion unit which converts the first user preference data into the second user preference data based on the first set of parameters and the second set of parameters stored in the storage unit, wherein the first set of parameters or the second set of parameters is a set of parameters which is learned so that a difference between the first compressed user preference data and the second compressed user preference data of a common user is small across a plurality of users.

As described above, according to the present invention, there can be provided a learning apparatus, a learning method, an information processing apparatus, a data selection method, a data accumulation method, a data conversion method and a program capable of commonly managing user preferences or content properties among domains of different data spaces.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic view which illustrates an outline of a system to which a recommendation technology associated with an embodiment is applied;

FIG. 2 is a block diagram which illustrates an example of a specific configuration of the terminal device of FIG. 1;

FIG. 3 is a block diagram which illustrates an example of a specific configuration of the information processing apparatus of FIG. 1;

FIG. 4 is an explanatory view which illustrates a calculation process to calculate user preference data from content property data;

FIG. 5 is an explanatory view which illustrates a process to compress data in accordance with a multi-topic model;

FIG. 6 is an explanatory view which illustrates an example of compressed user preference data and compressed content property data;

FIG. 7 is a flowchart which describes an example of the flow of a recommendation process associated with an embodiment;

FIG. 8 is an explanatory view which illustrates an outline of a learning process according to an embodiment;

FIG. 9 is a block diagram which illustrates an example of the configuration of a learning apparatus according to an embodiment;

FIG. 10 is a block diagram which illustrates an example of the configuration of the information processing apparatus which performs the recommendation process according to an embodiment;

FIG. 11 is a schematic view which illustrates an outline of a system which performs a data accumulation process according to an embodiment;

FIG. 12 is a block diagram which illustrates an example of a specific configuration of the terminal device of FIG. 11;

FIG. 13 is a block diagram which illustrates an example of a specific configuration of the other terminal device of FIG. 11;

FIG. 14 is a block diagram which illustrates an example of a specific configuration of the information processing apparatus of FIG. 11;

FIG. 15 is a schematic view which illustrates an outline of a system which performs a data conversion process according to an embodiment;

FIG. 16 is a block diagram which illustrates an example of a specific configuration of the information processing apparatus of FIG. 15;

FIG. 17 is an explanatory view which illustrates a determination process of correspondence among data items;

FIG. 18 is an explanatory view which illustrates a conversion process of user preference data;

FIG. 19 is a block diagram which illustrates a specific configuration according to a modified example of a data conversion apparatus; and

FIG. 20 is a block diagram which illustrates a hardware configuration of a general purpose computer.

DETAILED DESCRIPTION OF EMBODIMENT(S)

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the appended drawings. Note that, in this specification and the appended drawings, structural elements that have substantially the same function and structure are denoted with the same reference numerals, and repeated explanation of these structural elements is omitted.

Hereinafter, preferred embodiments of the present invention will be described in the following order.

1. Description of Related Art

2. Description of a Learning Apparatus According to an Embodiment

3. Description of a Recommendation Apparatus According to an Embodiment

4. Description of a Data Accumulation Apparatus According to an Embodiment

5. Description of a Data Conversion Apparatus According to an Embodiment

6. Summary

1. Description of Related Art

First, a recommendation technology associated with a later-mentioned embodiment of the present invention will be described with reference to FIGS. 1 to 7.

FIG. 1 is a schematic view which illustrates an outline of an information processing system 1 to which a recommendation technology associated with an embodiment of the present invention is applied. As illustrated in FIG. 1, the information processing system 1 includes a terminal device 10, a network 20 and an information processing apparatus 30.

The terminal device 10 is used by a user to receive a recommendation service from the information processing apparatus 30. For example, the terminal device 10 may be an information processing terminal such as a personal computer (PC) or a personal digital assistant (PDA), a cellular phone terminal, a game terminal, a digital home appliance such as a music player or a television set, or the like.

FIG. 2 is a block diagram which illustrates an example of a more specific configuration of the terminal device 10. As illustrated in FIG. 2, the terminal device 10 includes a user interface unit 12 and a processing unit 14.

The user interface unit 12 provides, for example, display means for the terminal device 10 to display information to a user and input means for the user to input information to the terminal device 10. The display means corresponds to a display device such as a CRT, PDP, LCD or OLED, for example. The input means corresponds to a mouse, a keyboard, a touch panel, a button, a switch or the like, for example.

For example, the processing unit 14 may be a browser which acquires a webpage on the internet and provides the webpage to a user for reading. In that case, a request for the user's action, such as purchasing or viewing/listening to a content on the internet, is transmitted from the processing unit 14 to the information processing apparatus 30. Further, the processing unit 14 may be an application which replays or executes a content at the terminal device 10. In that case, the information regarding the user's action of replaying or executing the content is transmitted from the processing unit 14 to the information processing apparatus 30. Here, the processing unit 14 may temporarily store a history of the user's individual actions inside the terminal device 10 and transmit the accumulated action histories to the information processing apparatus 30 at predetermined timing.

Referring again to FIG. 1, the description of the information processing system 1 will be continued.

The network 20 connects the terminal device 10 and the information processing apparatus 30. The network 20 may be an arbitrary network such as the internet, a wired or wireless local area network (LAN), a wide area network (WAN), a leased line or a virtual private network.

The information processing apparatus 30 provides a recommendation service to the user of the terminal device 10. For example, the information processing apparatus 30 is configured as a computer with access to a storage device which stores content property data and user preference data. The information processing apparatus 30 may be, for example, a server device which provides the recommendation service. Further, the information processing apparatus 30 may be a PC, a workstation, one of the abovementioned digital appliances or the like.

FIG. 3 is a block diagram which illustrates an example of a more specific configuration of the information processing apparatus 30. As illustrated in FIG. 3, the information processing apparatus 30 includes a user preference acquisition unit 32, a compression unit 34, a recommendation unit 36 and a storage unit 40. Further, the storage unit 40 includes a user preference database (DB) 42, a content property DB 44, a compressed user preference DB 46, a compressed content property DB 48 and a parameter DB 50.

The user preference acquisition unit 32 acquires user preference data indicating the preference of a user in accordance with, for example, an action or an action history transmitted from the terminal device 10. For example, user preference data to be applied to content-based filtering may be expressed as the linear sum, in the data space corresponding to the content domain, of the content property data of the contents which were the subjects of the user's actions.

FIG. 4 is an explanatory view which illustrates an example of a calculation process to calculate the user preference data from the content property data stored at the content property DB 44 of the storage unit 40.

In the example of FIG. 4, the content property DB 44 includes three pieces of content property data indicated by the identifiers C01, C02 and C03. Each piece of content property data has five data items, properties A to E. Namely, the data space of the content property data in this case is a vector space having five dimensions corresponding to properties A to E. In this vector space, the content property data of the content C01 is expressed as the vector (1, 0, 0, 1, 0), for example. The content property data of the contents C02 and C03 are expressed as the vectors (0, 0, 1, 0, 1) and (0, 0, 1, 0, 0), respectively. Further, in FIG. 4, a weight used for calculating the user preference data is defined for each piece of content property data.

In the example of FIG. 4, the data space of the user preference data has the same five data items, properties A to E. The value of each of the five data items is calculated, for example, as the weighted linear sum of the content property data, stored at the content property DB 44, of the contents which were the subjects of the user's actions. For example, assume that a user identified by the identifier U01 has used the contents C01, C02 and C03. In this case, the user preference data of the user U01 has the following values. The value of property A is 0.4 (=1×0.4+0×0.9+0×0.4). The value of property B is 0.0 (=0×0.4+0×0.9+0×0.4). The value of property C is 1.3 (=0×0.4+1×0.9+1×0.4). The value of property D is 0.4 (=1×0.4+0×0.9+0×0.4). The value of property E is 0.9 (=0×0.4+1×0.9+0×0.4). Accordingly, the user preference data of the user U01 is expressed as (0.4, 0.0, 1.3, 0.4, 0.9) in the vector space having properties A to E as its elements. The user preference acquisition unit 32 calculates such user preference data and stores it at the user preference DB 42.

In the above example, the user preference data is calculated as the weighted linear sum of the content property data. However, weighting need not necessarily be applied when calculating the linear sum. Further, the user preference data may be calculated by a method other than the linear sum, as long as the resulting data space is common to the content property data.
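For illustration, the weighted linear-sum calculation of FIG. 4 can be sketched in a few lines of Python; the vectors, weights and identifiers below are the example values from the figure, not part of any actual implementation.

```python
import numpy as np

# Content property vectors in the five-dimensional space (properties A-E),
# taken from the example of FIG. 4.
content_properties = {
    "C01": np.array([1, 0, 0, 1, 0], dtype=float),
    "C02": np.array([0, 0, 1, 0, 1], dtype=float),
    "C03": np.array([0, 0, 1, 0, 0], dtype=float),
}
weights = {"C01": 0.4, "C02": 0.9, "C03": 0.4}

def user_preference(used_contents):
    """Weighted linear sum of the property vectors of the used contents."""
    return sum(weights[c] * content_properties[c] for c in used_contents)

# User U01 has used the contents C01, C02 and C03.
print(user_preference(["C01", "C02", "C03"]))  # -> [0.4 0.  1.3 0.4 0.9]
```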

Here, only the five data items of properties A to E are described in the example of FIG. 4. In general, however, the data space of practical content property data (and user preference data) has far higher dimensions. For example, the content property data is given in advance by analyzing the text of a content description which explains the content, with a method such as term frequency (TF)/inverse document frequency (IDF), or by analyzing the audio and/or video of the content data itself. Such content property data is apt to be a sparse vector having values of zero, or empty values, at many data items in a high-dimensional vector space. This is disadvantageous for recommendation algorithms such as content-based filtering in view of processing cost, accuracy of the recommendation result and the like. Accordingly, the content property data or the user preference data is compressed into lower-dimensional data by the compression unit 34 of FIG. 3.
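As an illustration of the TF/IDF analysis mentioned above, the following toy sketch derives a sparse property vector from a content description; the documents and tokenization are invented for the example and do not come from the embodiment itself.

```python
import math
from collections import Counter

# Toy corpus of content descriptions; identifiers reuse the example of FIG. 4.
docs = {
    "C01": "mystery novel by author x set in tokyo".split(),
    "C02": "romantic comedy film with music by composer y".split(),
    "C03": "documentary film about music history".split(),
}

def tf_idf(doc_id):
    """Return a sparse TF-IDF vector (term -> weight) for one description."""
    tokens = docs[doc_id]
    tf = Counter(tokens)
    n_docs = len(docs)
    vec = {}
    for term, count in tf.items():
        df = sum(1 for d in docs.values() if term in d)  # document frequency
        vec[term] = (count / len(tokens)) * math.log(n_docs / df)
    return vec  # terms absent from the document are implicitly zero

print(tf_idf("C01"))
```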

In FIG. 3, the compression unit 34 generates compressed user preference data having a smaller number of data items from the user preference data by utilizing a set of parameters stored at the parameter DB 50. Likewise, the compression unit 34 can generate compressed content property data having a smaller number of data items from the content property data by utilizing a set of parameters stored at the parameter DB 50.

The compression of the user preference data by the compression unit 34 may be performed in accordance with the concept of a multi-topic model, for example. The multi-topic model is a probabilistic model which utilizes probability distributions in a data-intrinsic topic space and probability distributions which are respectively allocated to topics in a metadata space. A plurality of variations of such probability models have been proposed, for example in Thomas Hofmann, 'Probabilistic Latent Semantic Indexing', Proceedings of the Twenty-Second Annual International SIGIR Conference on Research and Development in Information Retrieval, 1999, and David M. Blei, Andrew Y. Ng and Michael I. Jordan, 'Latent Dirichlet Allocation', Journal of Machine Learning Research 3, 2003. The main aspects relevant to the present invention are as follows.

First, in the multi-topic model, a plurality of topics are defined as the possible values of latent discrete random variables which cannot be directly observed. A probability distribution over the metadata space is allocated to each topic. The probability distribution of the metadata space allocated to each topic is called the topic intrinsic distribution.

When the concept of the multi-topic model is applied to a recommendation system, the topic intrinsic distributions are estimated in advance by statistical learning, with a group of content property data and/or user preference data as learning data. Each piece of content property data or user preference data then has an intrinsic probability distribution, called the topic distribution, in the topic space, which is the latent variable space. For example, the probability distribution in the metadata space of each piece of content property data or user preference data is acquired by averaging the topic intrinsic distributions weighted by the topic distribution. Herein, the metadata space may be, for example, the vector space having properties A to E shown in FIG. 4 as its elements.

In general, the dimension of the parameters of the topic distribution is low. Further, since the topic distribution is determined for each piece of content property data or user preference data, the parameters of the topic distribution generated from the content property data or the user preference data can be regarded as a dimensionally compressed version of that data. Therefore, in this specification, the parameters of the topic distribution corresponding to content property data are called the compressed content property data, and the parameters of the topic distribution corresponding to user preference data are called the compressed user preference data.

Such compressed content property data or compressed user preference data can be calculated once the topic intrinsic distributions are determined. That is, the content property data or the user preference data, which is a high-dimensional sparse vector, can be compressed into low-dimensional data in accordance with the multi-topic model.

The compression unit 34 of FIG. 3 can generate the compressed user preference data as the topic distribution corresponding to the user preference data by utilizing the parameters of the topic intrinsic distributions which are previously determined by learning in accordance with the concept of the multi-topic model, for example. In this case, the parameters of the topic intrinsic distributions are previously stored at the parameter DB 50 of FIG. 3. Hereinafter, in this specification, the parameters of the topic intrinsic distributions used for the compression of the user preference data or the content property data are called the model parameters. Note that the parameters used for the compression are not limited to the parameters of the topic intrinsic distributions of the multi-topic model and may be other arbitrary parameters.

FIG. 5 is an explanatory view which further illustrates a process to generate the compressed user preference data having a smaller number of data items from the user preference data in accordance with the concept of the multi-topic model.

As illustrated in FIG. 5, first, the user preference data UP is provided in the data space D, which is a vector space with N elements, properties 1 to N. Further, k model parameters P_i(x) (i = 1, …, k), which are previously determined in the data space D by learning, are provided as well. The model parameters P_i(x) (i = 1, …, k) correspond to the probability distributions of the k topics in the data space D, namely, to the topic intrinsic distributions. As described above, k is in general smaller than N. Here, when the occurrence probability of a given data value x of the user preference data UP in the data space D is denoted P(x), P(x) is expressed by the following equation utilizing the k model parameters P_i(x) (i = 1, …, k).

[Math 1]

P(x) = w_1 P_1(x) + w_2 P_2(x) + … + w_k P_k(x)  (1)

Herein, the parameters of the topic distribution corresponding to the user preference data UP are expressed as w_i (i = 1, …, k). Namely, each w_i corresponds to a topic mixing ratio. By utilizing the topic mixing ratios w_i (i = 1, …, k), the compressed user preference data UP′ is calculated as the vector whose elements are the topic mixing ratios (w_1, w_2, …, w_k). Note that such data compression can be performed similarly on content property data instead of user preference data.
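Once the topic intrinsic distributions P_i(x) are fixed, the topic mixing ratios w_i of equation (1) can be estimated, for example, with an EM-style procedure. The following is a minimal sketch under the assumptions that the data space is discrete and the user preference data is a nonnegative, count-like vector; the function name, iteration count and numerical values are illustrative only, not the embodiment's prescribed routine.

```python
import numpy as np

def compress(up, topic_dists, n_iter=200):
    """Estimate the topic mixing ratios w (the compressed user preference
    data) for the data vector `up` (length N) under fixed topic intrinsic
    distributions `topic_dists` (k x N, each row a distribution over D)."""
    k = topic_dists.shape[0]
    w = np.full(k, 1.0 / k)  # start from a uniform mixture
    for _ in range(n_iter):
        # E-step: responsibility of each topic for each data item.
        joint = w[:, None] * topic_dists                       # k x N
        resp = joint / (joint.sum(axis=0, keepdims=True) + 1e-12)
        # M-step: re-estimate the mixing ratios from the weighted data.
        w = resp @ up
        w /= w.sum()
    return w

# Example: 3 topics over the 5 properties A-E; distributions are invented.
P = np.array([[0.6, 0.2, 0.1, 0.05, 0.05],
              [0.05, 0.05, 0.6, 0.05, 0.25],
              [0.1, 0.1, 0.1, 0.6, 0.1]])
print(compress(np.array([0.4, 0.0, 1.3, 0.4, 0.9]), P))
```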

Referring to FIG. 3 once more, the description of the configuration of the information processing apparatus 30 will be continued.

The recommendation unit 36 of the information processing apparatus 30 specifies the content suitable for the user by utilizing the compressed user preference data and the compressed content property data whose dimensions have been compressed (i.e., whose number of data items has been reduced) by the compression unit 34 as described above, and then transmits the result to the terminal device 10 as the recommendation result.

FIG. 6 is an explanatory view which illustrates a data example of the compressed user preference data and the compressed content property data.

As illustrated in FIG. 6, the user preference DB 42 includes two pieces of user preference data, for the users U01 and U02, in the five-dimensional data space of properties A to E. Meanwhile, the compressed user preference DB 46 includes two pieces of compressed user preference data, each compressed into three dimensions. When the user preference data is compressed in this manner, there are cases where two pieces of user preference data whose similarity degree is low before compression show a high similarity degree after compression. This phenomenon occurs because the property values of different data items having latent relatedness are aggregated into one topic mixing ratio value by the multi-topic model. Accordingly, latent similarity between users can be taken into account in the recommendation, so that contents more suitable for the user's preference are recommended. The same applies to the case where the user preference data and the content property data are compared.

In the example of FIG. 6, the content property DB 44 includes two pieces of content property data, for the contents C01 and C02, in the five-dimensional data space of properties A to E. Meanwhile, the compressed content property DB 48 includes two pieces of compressed content property data, each compressed into three dimensions. The compressed content property data are previously generated by the compression unit 34 and stored at the compressed content property DB 48. The recommendation unit 36 of FIG. 3 specifies the content to be the subject for recommendation in the procedure of FIG. 7 by utilizing the compressed user preference data and the compressed content property data.

FIG. 7 is a flowchart which describes an example of the flow of the recommendation process by the recommendation unit 36.

As illustrated in FIG. 7, first, the recommendation unit 36 reads the compressed user preference data of the target user of the recommendation service from the compressed user preference DB 46 (S2). Next, the recommendation unit 36 reads the compressed content property data from the compressed content property DB 48 (S4). The compressed content property data to be read may be, for example, a part of the data extracted under predetermined extraction conditions. Next, the recommendation unit 36 calculates the similarity degree between the compressed user preference data read in step S2 and the compressed content property data read in step S4 (S6). The similarity degree may be, for example, the standard inner product of the vectors, the sign-reversed Euclidean distance, the cosine similarity or the like. Then, the recommendation unit 36 generates a list of a predetermined number of contents, for example in descending order of the calculated similarity degree, and transmits the generated list to the terminal device 10 as the recommendation result (S8).
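The ranking of steps S4 to S8 can be sketched as follows; the similarity measures correspond to the alternatives named above, while the function names and the data layout (a dictionary from content identifier to compressed vector) are assumptions made for illustration.

```python
import numpy as np

def similarity(u, c, measure="inner"):
    """Similarity degree between two compressed vectors."""
    if measure == "inner":       # standard inner product
        return float(u @ c)
    if measure == "euclidean":   # sign-reversed Euclidean distance
        return -float(np.linalg.norm(u - c))
    if measure == "cosine":      # cosine similarity
        return float(u @ c / (np.linalg.norm(u) * np.linalg.norm(c)))
    raise ValueError(measure)

def recommend(up_compressed, compressed_cp_db, n=10, measure="inner"):
    """Return the identifiers of the n most similar contents (S4-S8)."""
    scored = [(cid, similarity(up_compressed, vec, measure))
              for cid, vec in compressed_cp_db.items()]
    scored.sort(key=lambda item: item[1], reverse=True)  # descending order
    return [cid for cid, _ in scored[:n]]
```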

In the above example, the recommendation unit 36 performs the recommendation process based on content-based filtering. However, the effect of the abovementioned data compression can also be obtained in a case where the recommendation unit 36 performs the recommendation process based on another algorithm such as collaborative filtering.

Up to this point, the recommendation technology associated with an embodiment of the present invention has been described with reference to FIGS. 1 to 7. With such a recommendation technology, the recommendation is performed after the dimension of the user preference data or the content property data belonging to a high-dimensional data space is compressed in accordance with the multi-topic model. Therefore, the recommendation is performed more suitably for the user, and the freshness and range of the recommended contents can be improved.

In this related technology, the model parameters used for the compression of the user preference data or the content property data by the compression unit 34 of FIG. 3 are previously determined by learning for each data space to which the user preference data or the content property data belongs. Therefore, when the data spaces to which the user preference data or the content property data belong are different from each other, recommendation cannot be performed by comparing the compressed data of the respective data spaces with each other. In contrast, by utilizing the learning method described in the following, it becomes possible to perform cross-domain recommendation utilizing user preference data or content property data of different data spaces.

2. Description of a Learning Apparatus According to an Embodiment

[Outline of Learning Apparatus]

FIG. 8 is an explanatory view which illustrates an outline of the learning process of a learning apparatus according to an embodiment of the present invention.

In FIG. 8, two different data spaces D1 and D2 are illustrated. The data space D1 is for a content domain of books, for example. The data space D1 includes n data items, A_1 to A_n. For example, the data item A_1 is "Author X", the data item A_2 is "Author Y", …, and the data item A_n is "Genre Z". Meanwhile, the data space D2 is for a content domain of television programs (TV programs), for example. The data space D2 includes m data items, B_1 to B_m. For example, the data item B_1 is "Talent α", the data item B_2 is "Talent β", …, and the data item B_m is "Time zone ω".

Here, it is assumed that the user preference data of a common user U1 are UP1 = (2, 1, …, 1) in the data space D1 and UP2 = (0, 2, …, 0) in the data space D2. Under this assumption, the dimensions of the user preference data UP1 and UP2 are different from each other, and the meanings of their respective elements are not associated with each other as they are. Therefore, the user preference data UP1 and UP2 cannot be directly compared with each other. Now, first, suppose that compressed user preference data UP1′ is generated from the user preference data UP1 by utilizing the model parameters P1_i. The dimension of the compressed user preference data UP1′ corresponds to the number of the topic mixing ratios w_1 to w_k, namely, the number of the model parameters P1_i. Meanwhile, by generating compressed user preference data UP2′ from the user preference data UP2 utilizing the same number of model parameters P2_j as the model parameters P1_i, the dimension of the compressed user preference data UP2′ becomes the same as the dimension of the compressed user preference data UP1′. Further, the model parameters P1_i or the model parameters P2_j are determined so that the sets of topic mixing ratios w_1 to w_k generated from the user preference data UP1 and UP2 are equal, or at least have small differences, across a plurality of common users. As a result, it becomes possible to compare the compressed user preference data UP1′ generated from the user preference data UP1 and the compressed user preference data UP2′ generated from the user preference data UP2 with each other.
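Assuming the compress() function from the earlier sketch (an assumption, since the embodiment does not prescribe a concrete routine), the effect of aligned parameter sets can be illustrated as follows: user preference data of different dimensions n and m are mapped into one k-dimensional space where they become directly comparable. All numbers here are illustrative.

```python
import numpy as np

# Two parameter sets with the same number of topics k map user preference
# data of different dimensions into one k-dimensional compressed space.
n, m, k = 6, 4, 3
rng = np.random.default_rng(0)
P1 = rng.dirichlet(np.ones(n), size=k)  # k topic distributions over space D1
P2 = rng.dirichlet(np.ones(m), size=k)  # k topic distributions over space D2

up1 = np.array([2, 1, 0, 0, 1, 1], dtype=float)  # UP1 in D1 (n-dimensional)
up2 = np.array([0, 2, 1, 0], dtype=float)        # UP2 in D2 (m-dimensional)

up1_c = compress(up1, P1)  # k-dimensional compressed data UP1'
up2_c = compress(up2, P2)  # k-dimensional, hence comparable with UP1'
print(np.dot(up1_c, up2_c))
```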

The learning apparatus 100 according to the embodiment described in the following determines, of the model parameters P1_i and the model parameters P2_j, the model parameters P2_j by learning.

[Configuration of Learning Apparatus]

FIG. 9 is a block diagram which illustrates the logical configuration of the learning apparatus 100. As illustrated in FIG. 9, the learning apparatus 100 includes a first data acquisition unit 120, a compression unit 122, a learning unit 130 and a second data acquisition unit 140. Further, the learning apparatus 100 includes a first user preference DB 110, a second user preference DB 112, a first parameter DB 114, a first compressed user preference DB 124 and a second parameter DB 132.

When a learning process is performed by the learning apparatus 100, the data to be used for the learning are previously prepared at the first user preference DB 110, the second user preference DB 112 and the first parameter DB 114, respectively. The first user preference DB 110 is prepared with a plurality of pieces of first user preference data which belong to the data space D1 of FIG. 8, for example. Further, the second user preference DB 112 is prepared with a plurality of pieces of second user preference data which belong to the data space D2, which is different from the data space D1, for the users in common with the first user preference data. Furthermore, the first parameter DB 114 is prepared with the k model parameters P1_i (i = 1, …, k) used for generating the first compressed user preference data, having a smaller number of data items, from the first user preference data.

When the learning process of the learning apparatus 100 is started, first, the first data acquisition unit 120 acquires the first user preference data belonging to the data space D1 from the first user preference DB 110 and outputs the acquired data to the compression unit 122. Next, the compression unit 122 compresses the first user preference data by utilizing the k model parameters P1_i prepared at the first parameter DB 114 and generates the first compressed user preference data. The first compressed user preference data generated by the compression unit 122 is stored at the first compressed user preference DB 124. The first compressed user preference data generated at this time is treated as training data for the second compressed user preference data by the later-mentioned learning unit 130.

Meanwhile, the second data acquisition unit 140 acquires the second user preference data belonging to the data space D2 from the second user preference DB 112 and outputs it to the learning unit 130. Then, the learning unit 130 reads the first compressed user preference data of the common user from the first compressed user preference DB 124 and regards it as the compression result (i.e., the training data) for the case where the second user preference data is compressed. The learning unit 130 then determines, by learning, the k model parameters P2_j (j = 1, …, k) for generating the training data from the abovementioned second user preference data, and stores them at the second parameter DB 132. By performing such a learning process for a sufficient number of users, the difference between the first compressed user preference data generated by the compression unit 122 and the second compressed user preference data can be kept small even for a new user.

Note that, in the above example, the model parameters P1_i for the data space D1 are fixed values and the model parameters P2_j for the data space D2 are learned. However, it is also possible to determine the model parameters P1_i and the model parameters P2_j simultaneously by learning.
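As one way to realize the learning unit 130, the second set of model parameters can be fitted with the first compressed user preference data as fixed targets. The following sketch uses NMF-style multiplicative updates with the target mixing ratios held fixed; this is one possible realization under stated assumptions (count-like second user preference data, a KL-type fitting criterion), not the algorithm prescribed by the embodiment.

```python
import numpy as np

def learn_second_parameters(X2, W_target, k, n_iter=500, eps=1e-12):
    """Learn topic intrinsic distributions P2 (k x m) for data space D2 so
    that the second user preference data X2 (users x m) is well explained
    by the given topic mixing ratios W_target (users x k), i.e. the first
    compressed user preference data used as training data."""
    m = X2.shape[1]
    rng = np.random.default_rng(0)
    P2 = rng.dirichlet(np.ones(m), size=k)          # random initial rows
    V = X2 / (X2.sum(axis=1, keepdims=True) + eps)  # normalize each user
    for _ in range(n_iter):
        ratio = V / (W_target @ P2 + eps)           # users x m
        # Multiplicative update minimizing a KL-type divergence,
        # with the mixing ratios W_target held fixed.
        P2 *= (W_target.T @ ratio) / (W_target.sum(axis=0)[:, None] + eps)
        P2 /= P2.sum(axis=1, keepdims=True) + eps   # keep rows distributions
    return P2
```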

3. Description of a Recommendation Apparatus According to an Embodiment

By utilizing the model parameters P1_i and P2_j determined by the learning apparatus 100, the user preference data and the content property data belonging to the different data spaces D1 and D2 can be compressed into data belonging to the common compressed data space D′ as illustrated in FIG. 8. That is, it becomes possible to perform cross-domain recommendation across a plurality of domains by utilizing the user preference data or the content property data provided in different content domains. Accordingly, the information processing apparatus (i.e., the recommendation apparatus) which performs recommendation of contents by utilizing the model parameters P1_i and P2_j determined by the abovementioned learning apparatus 100 will be described in the following.

FIG. 10 is a block diagram which illustrates the logical configuration of the information processing apparatus 200 which performs the recommendation process according to an embodiment of the present invention. As illustrated in FIG. 10, the information processing apparatus 200 includes a data acquisition unit 210, a compression unit 220, a recommendation unit 230 and a storage unit 240. The recommendation unit 230 includes a selection unit 232 and a transmission unit 234. Further, the storage unit 240 includes a user preference DB 242, a content property DB 244, a compressed user preference DB 246, a compressed content property DB 248 and a parameter DB 250.

The data acquisition unit 210 acquires the first user preference data belonging to the data space D1, which corresponds to the first content domain. More specifically, the data acquisition unit 210 may, for example, calculate and acquire the first user preference data by utilizing the first content property data previously stored at the content property DB 244, based on the action or the action history of the user regarding the first content domain. Alternatively, the data acquisition unit 210 may acquire previously-calculated first user preference data from a database internal or external to the information processing apparatus 200, or the like. The data acquisition unit 210 stores the acquired first user preference data at the user preference DB 242.

The compression unit 220 generates the first compressed user preference data, having a smaller number of data items, from the first user preference data by utilizing the first set of parameters. More specifically, the compression unit 220 may generate the first compressed user preference data in accordance with the concept of the multi-topic model described with reference to FIG. 5, for example, by utilizing the first set of model parameters P1_i stored at the parameter DB 250. The compression unit 220 stores the generated first compressed user preference data at the compressed user preference DB 246.

The second compressed user preference data of a plurality of users, generated by utilizing the second set of model parameters P2_j from the second user preference data belonging to the data space D2, which is different from the data space D1, are previously stored at the compressed user preference DB 246 of the storage unit 240. Further, the second compressed content property data of a plurality of contents, generated by utilizing the second set of model parameters P2_j from the second content property data belonging to the data space D2, are previously stored at the compressed content property DB 248. Here, the second set of model parameters P2_j utilized for generating the second compressed user preference data and the second compressed content property data is the set of parameters which is previously learned so that the difference between the first compressed user preference data and the second compressed user preference data is small across the plurality of users.

The selection unit 232 of the recommendation unit 230 selects at least one piece of data from among the abovementioned plurality of pieces of second compressed user preference data or second compressed content property data according to the similarity degree to the first compressed user preference data generated by the compression unit 220. More specifically, the selection unit 232 may select second compressed content property data having a high similarity degree to the first compressed user preference data generated by the compression unit 220, for example in accordance with the concept of content-based filtering. Here, the similarity degree may be calculated, for example, as the standard inner product of the vectors, the sign-reversed Euclidean distance, the cosine similarity or the like. Then, the selection unit 232 outputs a content identifier or the like corresponding to at least one piece of the selected compressed content property data to the transmission unit 234. Alternatively, the selection unit 232 may select second compressed user preference data having a high similarity degree to the first compressed user preference data generated by the compression unit 220, for example in accordance with the concept of collaborative filtering. In this case, the selection unit 232 outputs, for example, a content identifier or the like specifying a content used in the past by the user corresponding to the selected second compressed user preference data.

The transmission unit 234 generates the recommendation result (i.e., a content identifier list, a webpage displaying the recommendation result or the like), for example in accordance with the content identifier inputted from the selection unit 232, and transmits the generated result to an external device such as the terminal device 10 of FIG. 1.

As can be seen from the above description, the information processing apparatus 200 can recommend a content of the second content domain, corresponding to the data space D2, by utilizing the first user preference data of the data space D1, corresponding to the first content domain. This recommendation process is enabled by learning model parameters capable of compressing the data of the two different data spaces D1 and D2 into data of the common compressed data space D′. With this recommendation process by the information processing apparatus 200, it becomes possible to recommend contents suitable for the user in a cross-domain manner across a variety of content domains such as music, video, books and news articles, for example.

The data spaces D1 and D2 are not limited to data spaces corresponding to different content domains. For example, the data spaces D1 and D2 may be data spaces having different data items defined for the same content domain. In a case where the types or manufacturing vendors of the devices, such as PCs and music players, which generate the user preference data are different, the data spaces of the user preference data may differ even for user preference data in the same music domain. In this case also, the information processing apparatus 200 according to the present embodiment makes it possible to perform recommendation based on the user preference data generated in one data space by utilizing the user preference data or the content property data belonging to another data space.

4. Description of a Data Accumulation Apparatus According to an Embodiment

Further, when the model parameters P1_i and P2_j determined by the learning apparatus 100 are utilized, the user preference data and the content property data generated in the different data spaces D1 and D2 can be accumulated after being compressed into data of the common compressed data space D′. Accordingly, user preference data dispersed as data of different data spaces across various devices can be accumulated at a single data accumulation apparatus, so that the accuracy of the recommendation process is improved. The information processing apparatus (i.e., the data accumulation apparatus) capable of accumulating the user preference data generated at a plurality of devices as data of a single compressed data space by utilizing the model parameters determined by the abovementioned learning apparatus 100 will therefore be described in the following.
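The accumulation idea can be sketched as a store keyed by user identifier that merges incoming k-dimensional compressed vectors from different devices; the class name and the running-mean merge policy here are assumptions for illustration, not part of the embodiment.

```python
import numpy as np

class CompressedPreferenceStore:
    """Accumulates compressed user preference vectors that arrive from
    devices with different original data spaces but share one
    k-dimensional compressed space."""

    def __init__(self):
        self._data = {}  # user_id -> (mean vector, number of updates)

    def accumulate(self, user_id, compressed_up):
        vec, count = self._data.get(
            user_id, (np.zeros_like(compressed_up), 0))
        count += 1
        vec = vec + (compressed_up - vec) / count  # running mean
        self._data[user_id] = (vec, count)

    def get(self, user_id):
        return self._data[user_id][0]
```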

FIG. 11 is a schematic view which illustrates an outline of an information processing system 300 utilizing the data accumulation apparatus according to an embodiment of the present invention. As illustrated in FIG. 11, the information processing system 300 includes terminal devices 310 and 320 and an information processing apparatus 330.

A music player is illustrated in FIG. 11 as an example of the terminal device 310. However, the terminal device 310 is not limited to this example and may be an arbitrary device. Similarly, a television set is illustrated as an example of the terminal device 320; the terminal device 320 may likewise be an arbitrary device. The terminal devices 310 and 320 respectively generate user preference data belonging to different data spaces and transmit the generated data to the information processing apparatus 330.

FIG. 12 is a block diagram which illustrates the logical configuration of the terminal device 310. As illustrated in FIG. 12, the terminal device 310 includes a first application unit 311, a first data generation unit 312, a first user preference DB 313, a first content property DB 314 and a first data transmission unit 315.

The first application unit 311 manages the content which is to be a subject of the user's action using the terminal device 310. Namely, the later-mentioned first user preference data is generated in accordance with the user's action such as replay or execution of the content utilizing the first application unit 311. The information regarding the user's action utilizing the first application unit 311 is outputted to the first data generation unit 312.

When the information regarding the abovementioned user's action is received from the first application unit 311, the first data generation unit 312 generates the first user preference data belonging to the first data space by utilizing the first content property data previously stored at the first content property DB 314. The first data space is, for example, the data space corresponding to the music domain in the case where the first application unit 311 is the application for music replay. The generation process of the first user preference data by the first data generation unit 312 may be the process in accordance with the content-based filtering described with reference to FIG. 4, for example.
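
As an illustration only: FIG. 4 is not reproduced in this section, so the following Python sketch shows one common content-based construction under that caveat, in which the user preference vector is accumulated from the property vectors of the contents the user acted on. The function name, the toy property DB and its dimensionality are all hypothetical.

    import numpy as np

    def generate_user_preference(acted_content_ids, content_property_db):
        # Accumulate the property vectors of the contents the user acted on
        # (replay, execution, ...) into one preference vector belonging to
        # the same data space; a common content-based filtering scheme.
        return np.sum([content_property_db[cid] for cid in acted_content_ids], axis=0)

    # Hypothetical first content property DB: 3-dimensional property
    # vectors of two music contents in the data space D1.
    cp_db = {"song_1": np.array([1.0, 0.0, 0.5]),
             "song_2": np.array([0.0, 1.0, 0.5])}
    up1 = generate_user_preference(["song_1", "song_2"], cp_db)  # [1.0, 1.0, 1.0]

The first data generation unit 312 stores the generated first user preference data at the first user preference DB 313.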

The first data transmission unit 315 acquires the first user preference data generated by the first data generation unit 312 from the first user preference DB 313, for example, and transmits the acquired data to the information processing apparatus 330 of FIG. 11. Further, the first data transmission unit 315 may transmit, to the information processing apparatus 330, the first content property data belonging to the first data space acquired from the first content property DB 314. The data transmission process to the information processing apparatus 330 by the first data transmission unit 315 may be performed, for example, when a user uses the first application unit 311 or at previously determined specific time intervals.

FIG. 13 is a block diagram which illustrates the logical configuration of the terminal device 320. As illustrated in FIG. 13, the terminal device 320 includes a second application unit 321, a second data generation unit 322, a second user preference DB 323, a second content property DB 324 and a second data transmission unit 325.

The second application unit 321 manages the content which is to be a subject of the user's action using the terminal device 320. Namely, the later-mentioned second user preference data is generated in accordance with the user's action such as replay or execution of the content utilizing the second application unit 321. The information regarding the user's action utilizing the second application unit 321 is outputted to the second data generation unit 322.

When the information regarding the abovementioned user's action is received from the second application unit 321, the second data generation unit 322 generates the second user preference data belonging to the second data space by utilizing the second content property data previously stored at the second content property DB 324. The second data space is, for example, the data space corresponding to a television program domain in the case where the second application unit 321 is an application for displaying television programs. The generation process of the second user preference data by the second data generation unit 322 may be the process in accordance with the content-based filtering described with reference to FIG. 4, for example. The second data generation unit 322 stores the generated second user preference data at the second user preference DB 323.

The second data transmission unit 325 acquires the second user preference data generated by the second data generation unit 322 from the second user preference DB 323, for example, and transmits the acquired data to the information processing apparatus 330 of FIG. 11. Further, the second data transmission unit 325 may transmit, to the information processing apparatus 330, the second content property data belonging to the second data space acquired from the second content property DB 324. Similarly to the transmission process by the above-mentioned first data transmission unit 315, the data transmission process to the information processing apparatus 330 by the second data transmission unit 325 may be performed, for example, when a user uses the second application unit 321 or at previously determined specific time intervals.

FIG. 14 is a block diagram which illustrates the logical configuration of the information processing apparatus 330. As illustrated in FIG. 14, the information processing apparatus 330 includes a data reception unit 332, a first data acquisition unit 334, a second data acquisition unit 336, a first compression unit 338 and a second compression unit 340. Further, the information processing apparatus 330 includes an identifier identification DB 350, a parameter DB 352, a compressed user preference DB 354 and a compressed content property DB 356.

The data reception unit 332 receives the user preference data or the content property data transmitted from the terminal device 310 and the terminal device 320 which are described above. Herein, the user identifiers contained in the user preference data or the content identifiers contained in the content property data transmitted from the terminal device 310 and the terminal device 320 are not always unified between the terminal devices. Accordingly, the data reception unit 332 identifies the user or the content to be related to the received data by utilizing an identifier correspondence table which is previously stored at the identifier identification DB 350. Then, the data reception unit 332 outputs the received data to the first data acquisition unit 334 or the second data acquisition unit 336 in accordance with the data space corresponding to the identified user or content.
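
As a rough sketch of this dispatch, the snippet below assumes a hypothetical correspondence table keyed by device and device-local identifier; the table contents, device labels and function names are invented for illustration.

    # Hypothetical identifier correspondence table of the identifier
    # identification DB: device-local user identifiers resolved to a
    # unified user identifier.
    ID_TABLE = {("player", "u-001"): "user_A",
                ("tv", "viewer-42"): "user_A"}

    def receive(device, local_user_id, data, first_unit, second_unit):
        # Resolve the unified user, then dispatch the received data to the
        # acquisition unit matching the originating device's data space.
        user = ID_TABLE[(device, local_user_id)]
        if device == "player":          # terminal device 310 -> data space D1
            first_unit(user, data)
        else:                           # terminal device 320 -> data space D2
            second_unit(user, data)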

The first data acquisition unit 334 acquires the first user preference data or the first content property data belonging to the first data space among the data received by the data reception unit 332 and outputs the acquired data to the first compression unit 338. Meanwhile, the second data acquisition unit 336 acquires the second user preference data or the second content property data belonging to the second data space among the data received by the data reception unit 332 and outputs the acquired data to the second compression unit 340.

The first compression unit 338 generates the first compressed user preference data having fewer data items, for example, from the first user preference data inputted by the first data acquisition unit 334 by utilizing the first set of parameters which are previously stored at the parameter DB 352. Then, the first compression unit 338 outputs the generated first compressed user preference data to the compressed user preference DB 354 for storage. In addition, the first compression unit 338 may compress the first content property data inputted from the first data acquisition unit 334 and store the result at the compressed content property DB 356.

The second compression unit 340 generates the second compressed user preference data having fewer data items, for example, from the second user preference data inputted by the second data acquisition unit 336 by utilizing the second set of parameters which are previously stored at the parameter DB 352. Then, the second compression unit 340 outputs the generated second compressed user preference data to the compressed user preference DB 354 for storage. In addition, the second compression unit 340 may compress the second content property data inputted from the second data acquisition unit 336 and store the result at the compressed content property DB 356.
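
The precise form of the compression depends on the multi-topic model and equation 1, which are not reproduced in this section. As a minimal sketch under one simple reading, a set of parameters can be held as a hypothetical items-by-topics matrix and the compressed data taken as the normalized projection of the preference vector onto the k topics:

    import numpy as np

    def compress(up, P):
        # P: hypothetical (data items x k topics) parameter matrix taken
        # from the parameter DB. The compressed user preference data is
        # the projection of the preference vector onto the k topics,
        # normalized so it can be read as a topic mixture.
        z = P.T @ up
        return z / z.sum() if z.sum() > 0 else z

    P1 = np.array([[0.9, 0.1],    # item a1 loads mostly on topic 1
                   [0.2, 0.8],    # item a2 loads mostly on topic 2
                   [0.5, 0.5]])
    cup1 = compress(np.array([1.0, 0.0, 0.2]), P1)  # 2-dimensional compressed data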

Here, the first and second sets of parameters previously stored at the parameter DB 352 correspond to, for example, the model parameters and the like of the multi-topic model determined through the learning process by the abovementioned learning apparatus 100. Namely, the first and second sets of parameters according to the present embodiment are learned so that the difference between the first compressed user preference data and the second compressed user preference data of the common user becomes small across a plurality of users. Accordingly, with the information processing apparatus 330 according to the present embodiment, the data generated respectively by the terminal devices 310, 320 can be accumulated at each database as data belonging to the common compressed data space.

Here, it is also possible to calculate third compressed user preference data by summing the first compressed user preference data and the second compressed user preference data of the common user after multiplying each by a predetermined ratio, and to store the third compressed user preference data at the database. The ratio to be multiplied to the first compressed user preference data and the second compressed user preference data may be determined in accordance with the number of histories contained in the user history for each device, for example. In this manner, by accumulating the user preference data and the content property data as data in the common compressed data space, the data dispersed at various devices are aggregated into a single database and utilized effectively, so that the accuracy of the recommendation process utilizing the data can be improved.
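
A minimal sketch of the ratio-weighted combination described above, assuming the per-device history counts are available as plain integers (all names are hypothetical):

    import numpy as np

    def aggregate(cup1, cup2, n_hist1, n_hist2):
        # Ratio-weighted sum of the two compressed vectors of the common
        # user; the ratio follows the per-device history counts, as the
        # text suggests.
        total = n_hist1 + n_hist2
        return (n_hist1 / total) * cup1 + (n_hist2 / total) * cup2

    cup3 = aggregate(np.array([0.8, 0.2]), np.array([0.4, 0.6]), 30, 10)
    # -> third compressed user preference data (0.7, 0.3)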

It should be noted that, in the example of the above description, the terminal device 310 and the terminal device 320 are of different types. However, the effects of the data accumulation by the information processing apparatus 330 can also be expected even in the case where the data spaces of the user preference data and the like are different due to different vendors or handling languages for terminal devices 310, 320 of the same type, for example.

5. Description of Data Conversion Apparatus According to an Embodiment

The model parameters P1_(i) and P2_(j) determined by the learning apparatus 100 may also be considered to indicate features, in the common compressed data space, of the data items constituting each data space. Therefore, when the parameter values of the model parameters P1_(i) and P2_(j) corresponding to two data items which belong to the different data spaces are similar to each other, the two data items are considered to have similarity therebetween. Accordingly, the user preference data or the content property data generated in one data space can be mapped to data belonging to another data space based on the model parameters P1_(i), P2_(j) determined by the learning apparatus 100. Thus, it is possible to circulate or reuse the user preference data or the content property data by mutually converting them between the different data spaces. Consequently, it is possible to increase opportunities of providing the recommendation service, for example. Therefore, the information processing apparatus (i.e., the data conversion apparatus) capable of converting the user preference data or the content property data generated in one data space into data belonging to another data space will be described in the following.

FIG. 15 is a schematic view which illustrates an outline of an information processing system 400 utilizing the data conversion apparatus according to an embodiment of the present invention. As illustrated in FIG. 15, the information processing system 400 includes recommendation devices 410, 420, terminal devices 412, 422 and an information processing apparatus 430.

The recommendation device 410 provides the recommendation service to the terminal device 412 by utilizing the first content property data and the first user preference data belonging to the data space D1. The terminal device 412 receives the recommendation result for contents of the domain corresponding to the data space D1 from the recommendation device 410 and proposes it to the user.

Meanwhile, the recommendation device 420 provides the recommendation service to the terminal device 422 by utilizing the second content property data and the second user preference data belonging to the data space D2 which is different from the data space D1. The terminal device 422 receives the recommendation result for contents of the domain corresponding to the data space D2 from the recommendation device 420 and proposes the received result to the user.

Between these two recommendation devices 410, 420, the information processing apparatus 430 converts the first user preference data belonging to the first data space into the second user preference data belonging to the second data space which is different from the first data space.

FIG. 16 is a block diagram which illustrates an example of the logical configuration of the information processing apparatus 430. As illustrated in FIG. 16, the information processing apparatus 430 includes a parameter DB 432, a mapping unit 434 and a conversion unit 436.

The parameter DB 432 stores the first set of parameters to generate the first compressed user preference data from the first user preference data belonging to the data space D1 and the second set of parameters to generate the second compressed user preference data from the second user preference data belonging to the data space D2. The first and second sets of parameters are learned by utilizing the abovementioned learning apparatus 100 so that the difference between the first compressed user preference data and the second compressed user preference data of the common user becomes small across a plurality of users. The first and second sets of parameters may respectively be the model parameters in accordance with the multi-topic model.

The mapping unit 434 determines the correspondence between the data items of the first user preference data and the data items of the second user preference data according to the similarity degree of the parameter values for the respective data items of the abovementioned first and second sets of parameters acquired from the parameter DB 432.

FIG. 17 is an explanatory view which illustrates the determination process of the correspondence by the mapping unit 434.

Properties a1-aN as the data items of the data space D1 and properties b1-bM as the data items of the data space D2 are indicated in FIG. 17. Further, the first model parameters Pa_(i) (i=1, . . . , k) and the second model parameters Pb_(j) (j=1, . . . , k) determined by the abovementioned learning in the data spaces D1, D2 are indicated as well.

In FIG. 17, focusing on property a2 of the data space D1, for example, the feature of property a2 related to the common compressed data space is indicated by the vector (0.1, 0.3, . . . , 0.1) having the k parameter values of the first model parameters Pa as its elements. In this specification, the vector whose elements are the parameter values of the model parameters in the case of focusing on a specific data item is called the index of the data item. Accordingly, N indexes of the data items are acquired for the N-dimensional data space D1 and M indexes of the data items are acquired for the M-dimensional data space D2.

The mapping unit 434 acquires the indexes of each data item from the two different data spaces D1, D2 and calculates the index similarity degree for respective pairs of data items. For example, the index similarity degree may be the standard inner product between the vectors, the sign-reversed Euclidean distance, the cosine similarity or the like. Then, for each data item of the data space D2, for example, the mapping unit 434 determines the data item of the data space D1 having the highest index similarity degree.
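
The sketch below implements this step with the cosine measure, one of the options listed above. It assumes the parameter matrices are arranged with one row per data item, each row being that item's index; this layout is an assumption for illustration rather than something the text fixes. With toy values it reproduces the b1 -> a2 and b2 -> a3 matches of FIG. 17.

    import numpy as np

    def best_matches(Pa, Pb):
        # Row i of Pa (N x k) is the index of item a_(i+1); row j of Pb
        # (M x k) is the index of item b_(j+1). For each D2 item, return
        # the D1 item whose index has the highest cosine similarity.
        def normalize(M):
            norms = np.linalg.norm(M, axis=1, keepdims=True)
            return M / np.where(norms == 0.0, 1.0, norms)
        sim = normalize(Pb) @ normalize(Pa).T   # (M x N) similarity matrix
        return sim.argmax(axis=1)

    Pa = np.array([[0.4, 0.6], [0.1, 0.3], [0.7, 0.2]])  # indexes of a1, a2, a3
    Pb = np.array([[0.2, 0.5], [0.9, 0.1]])              # indexes of b1, b2
    print(best_matches(Pa, Pb))  # [1 2]: b1 -> a2, b2 -> a3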

In the example of FIG. 17, the data item of the data space D1 having the highest similarity degree to property b1 of the data space D2 is property a2, for example. The data item of the data space D1 having the highest similarity degree to property b2 of the data space D2 is property a3. Further, the data item of the data space D1 having the highest similarity degree to property bM of the data space D2 is property a1. The mapping unit 434 outputs, to the conversion unit 436, the above-determined correspondence of data items from the data space D1 to the data space D2, for example.

The conversion unit 436 converts the first user preference data received from the recommendation device 410 of FIG. 15, for example, into the second user preference data in accordance with the correspondence of the data items of the data spaces D1, D2 determined by the mapping unit 434.

FIG. 18 is an explanatory view which describes the data conversion process by the conversion unit 436.

The correspondence of the data items of the data spaces D1, D2 determined by the mapping unit 434 is indicated in FIG. 18. Here, property a2 of the data space D1 is related to property b1 of the data space D2, property a3 of the data space D1 is related to property b2 of the data space D2, and property a1 of the data space D1 is related to property bM of the data space D2. Further, the first user preference data UP1 received from the recommendation device 410 of FIG. 15, for example, is indicated in FIG. 18 as well. Here, the first user preference data UP1 is indicated as (1.0, 0.0, 0.2, . . . , 2.0).

The conversion unit 436 sequentially acquires the data values corresponding to properties b1, b2, . . . , bM from the first user preference data UP1, for example, in accordance with the correspondence indicated in FIG. 18 and generates the second user preference data UP2. In this case, the second user preference data UP2 is to be (0.0, 0.2, . . . , 1.0) in accordance with the abovementioned correspondence. The second user preference data UP2 converted by the conversion unit 436 is outputted to the recommendation device 420 of FIG. 15, for example.
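
A minimal sketch of this rearrangement, using a toy 4-dimensional UP1 (the actual dimensions N and M are left open in the text); entry j of the correspondence list names the D1 item matched to property b_(j+1), so the example reproduces the FIG. 18 values:

    import numpy as np

    def convert(up1, correspondence):
        # Entry j of correspondence gives the (0-based) D1 item mapped to
        # b_(j+1); UP2 simply reads the corresponding values out of UP1.
        return up1[correspondence]

    up1 = np.array([1.0, 0.0, 0.2, 2.0])  # toy UP1 over a1..a4 (N = 4)
    correspondence = np.array([1, 2, 0])  # b1 -> a2, b2 -> a3, b3 -> a1 (M = 3)
    up2 = convert(up1, correspondence)    # [0.0, 0.2, 1.0], as in FIG. 18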

Provided that the user preference data can be mutually converted between the different data spaces as described above, the user preference data generated in various data spaces can be circulated or reused among devices or systems. At this point, it is not necessary to modify an existing application or database mounted on each device or system. Therefore, it becomes possible to increase opportunities of providing the recommendation service by utilizing the existing application or database without requiring additional cost.

Modified Example

FIG. 19 is a block diagram which illustrates the logical configuration of an information processing apparatus 530 according to a modified example of the data conversion apparatus. As illustrated in FIG. 19, the information processing apparatus 530 includes a parameter DB 532, a compression unit 534 and a conversion unit 536.

Similarly to the parameter DB 432 of FIG. 16, the parameter DB 532 stores the first and second sets of parameters. The first and second sets of parameters are learned by utilizing the abovementioned learning apparatus 100. The first and second sets of parameters may respectively be the model parameters in accordance with the multi-topic model.

The compression unit 534 generates the first compressed user preference data from the first user preference data inputted from the conversion unit 536 by utilizing the abovementioned first set of parameters acquired from the parameter DB 532. Then, the compression unit 534 outputs the generated first compressed user preference data to the conversion unit 536.

When the first compressed user preference data is generated by the compression unit 534, the conversion unit 536 determines likely second user preference data capable of generating second compressed user preference data which is equal to the first compressed user preference data by utilizing the second set of parameters stored at the parameter DB 532. More specifically, for example, the conversion unit 536 generates, over a predetermined number of trials, the second user preference data capable of generating the second compressed user preference data which is equal to the first compressed user preference data in accordance with the probability distribution of equation 1. Here, it is preferable that the predetermined number of trials is set larger as the absolute value of the vector of the first user preference data is larger. The conversion unit 536 may output the second user preference data determined as mentioned above as the conversion result of the user preference data.
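
Equation 1 is not reproduced in this section, so the following is only a sketch under a multi-topic reading: the first compressed data is treated as a mixture over k topics, the second set of parameters as hypothetical per-topic item distributions over the data space D2, and the likely second user preference data is accumulated as counts over the predetermined number of trials.

    import numpy as np

    def sample_up2(cup1, P2, n_trials, seed=0):
        # cup1: compressed first user preference data, read as a topic
        # mixture. P2: hypothetical (k topics x M items) matrix of
        # per-topic item distributions in the data space D2. Each trial
        # draws a topic and then a D2 item; the counts form a likely UP2.
        rng = np.random.default_rng(seed)
        topic_probs = cup1 / cup1.sum()
        up2 = np.zeros(P2.shape[1])
        for _ in range(n_trials):
            t = rng.choice(len(topic_probs), p=topic_probs)
            j = rng.choice(P2.shape[1], p=P2[t])
            up2[j] += 1.0
        return up2

    cup1 = np.array([0.7, 0.3])              # k = 2 topics
    P2 = np.array([[0.6, 0.3, 0.1],          # topic 1 over items b1..b3
                   [0.1, 0.2, 0.7]])         # topic 2 over items b1..b3
    up2 = sample_up2(cup1, P2, n_trials=100)

In line with the preference stated above, n_trials would be set larger as the norm of the first user preference data vector grows.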

6. Summary

Up to this point, the learning apparatus, the recommendation apparatus, the data accumulation apparatus and the data conversion apparatus according to embodiments of the present invention have been described in detail with reference to FIGS. 8 to 19. With these embodiments, it becomes possible to commonly manage user preferences and content properties among domains of different data spaces. Accordingly, benefits such as cross-domain recommendation, improved accuracy of recommendation results and increased opportunities of providing the recommendation service can advantageously be expected.

Note that the series of processes according to each embodiment described in this specification may be performed by either hardware or software. In the case where the whole or a part of the series of processes is performed by software, programs constituting the software are executed by utilizing a computer mounted in specific hardware or the general purpose computer of FIG. 20, for example.

In FIG. 20, a central processing unit (CPU) 902 controls the overall operation of the general purpose computer. A read only memory (ROM) 904 stores programs or data in which a part or the whole of the series of processes is described. A random access memory (RAM) 906 temporarily stores the programs or data used by the CPU 902 at the time of process execution.

The CPU 902, the ROM 904 and the RAM 906 are mutually connected via a bus 910. In addition, an input-output interface 912 is connected to the bus 910.

The input-output interface 912 connects the CPU 902, the ROM 904 and the RAM 906 with an input device 920, an output device 922, a storage device 924, a communication device 926 and a drive 930.

The input device 920 receives instructions or information input from a user via an input instrument such as a mouse, a keyboard, a touch panel, a button or a switch, for example. The output device 922 outputs information to the user via a display instrument such as a CRT, a PDP, an LCD or an OLED display, or an audio output instrument such as a speaker.

The storage device 924 is configured with a hard disk drive, a flash memory or the like, for example, and stores programs and data. The communication device 926 performs a communication process via a network such as a LAN or the internet. The drive 930 is arranged at the general purpose computer as needed. For example, a removable medium 932 is mounted to the drive 930.

In the case where the abovementioned series of processes is performed by software, the program stored at the ROM 904, the storage device 924 or the removable medium 932 of FIG. 20, for example, is read into the RAM 906 at the time of execution and executed by the CPU 902.

It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and alterations may occur depending on design requirements and other factors insofar as they are within the scope of the appended claims or the equivalents thereof.

In the examples described in this specification, user preferences or content properties are commonly managed among domains corresponding to two different data spaces. However, it is obvious that the present invention is applicable to three or more data spaces.

The present application contains subject matter related to that disclosed in Japanese Priority Patent Application JP 2009-017190 filed in the Japan Patent Office on Jan. 28, 2009, the entire content of which is hereby incorporated by reference.

1. A learning apparatus comprising: a first data acquisition unit which acquires first user preference data belonging to a first data space; a second data acquisition unit which acquires second user preference data of a user in common with the first user preference data, the second user preference data belonging to a second data space which is different from the first data space; a compression unit which generates first compressed user preference data having less data item number from the first user preference data by utilizing a first set of parameters; and a learning unit which learns a second set of parameters utilized for generating second compressed user preference data having the same data item number as that of the first compressed user preference data from the second user preference data so that difference between the first compressed user preference data and the second compressed user preference data is to be small across a plurality of users.
2. The learning apparatus according to claim 1, wherein the learning unit learns the second set of parameters with the first compressed user preference data generated by the compression unit as training data for the second compressed user preference data.
3. The learning apparatus according to claim 2, wherein the compression unit generates the first compressed user preference data in accordance with a multi-topic model.
4. The learning apparatus according to claim 3, wherein the first set of parameters and the second set of parameters are sets of parameters corresponding to intrinsic distribution of a topic of the multi-topic model.
5. The learning apparatus according to claim 1, wherein the first data space and the second data space are data spaces corresponding to content domains which are different from each other.
6. The learning apparatus according to claim 1, wherein the first data space and the second data space are data spaces of user preference data generated by devices which are different from each other.
7. A learning method, comprising the steps of: acquiring first user preference data belonging to a first data space; acquiring second user preference data of a user in common with the first user preference data, the second user preference data belonging to a second data space which is different from the first data space; generating first compressed user preference data having less data item number from the first user preference data by utilizing a first set of parameters; and learning a second set of parameters utilized for generating second compressed user preference data having the same data item number as that of the first compressed user preference data from the second user preference data so that difference between the first compressed user preference data and the second compressed user preference data is to be small across a plurality of users.
8. A program for causing a computer which controls an information processing apparatus to perform functions comprising: a first data acquisition unit which acquires first user preference data belonging to a first data space; a second data acquisition unit which acquires second user preference data of a user in common with the first user preference data, the second user preference data belonging to a second data space which is different from the first data space; a compression unit which generates first compressed user preference data having less data item number from the first user preference data by utilizing a first set of parameters; and a learning unit which learns a second set of parameters utilized for generating second compressed user preference data having the same data item number as that of the first compressed user preference data from the second user preference data so that difference between the first compressed user preference data and the second compressed user preference data is to be small across a plurality of users.
9. An information processing apparatus, comprising: a data acquisition unit which acquires first user preference data belonging to a first data space; a compression unit which generates first compressed user preference data having less data item number from the first user preference data by utilizing a first set of parameters; a storage unit which stores a plurality of data having the same data item number as that of the first compressed user preference data, the plurality of data being generated from second user preference data or content property data belonging to a second data space which is different from the first data space by utilizing a second set of parameters; and a selection unit which selects at least single data from the plurality of data stored at the storage unit according to similarity degree to the first compressed user preference data generated by the compression unit, wherein the plurality of data stored at the storage unit are respective data previously generated by utilizing the second set of parameters which is learned so that difference between the first compressed user preference data and second compressed user preference data generated from the second user preference data of a common user is to be small across a plurality of users.
10. A data selection method, comprising the steps of: acquiring first user preference data belonging to a first data space; generating first compressed user preference data having less data item number from the first user preference data by utilizing a first set of parameters; and selecting at least single data from a plurality of data having the same data item number as that of the first compressed user preference data according to similarity degree to the first compressed user preference data, the plurality of data being generated from second user preference data or content property data belonging to a second data space which is different from the first data space by utilizing a second set of parameters, wherein the plurality of data are respective data previously generated by utilizing the second set of parameters which is learned so that difference between the first compressed user preference data and second compressed user preference data generated from the second user preference data of a common user is to be small across a plurality of users.
11. A program for causing a computer which controls an information processing apparatus to perform functions comprising: a data acquisition unit which acquires first user preference data belonging to a first data space; a compression unit which generates first compressed user preference data having less data item number from the first user preference data by utilizing a first set of parameters; a storage unit which stores a plurality of data having the same data item number as that of the first compressed user preference data, the plurality of data being generated from second user preference data or content property data belonging to a second data space which is different from the first data space by utilizing a second set of parameters; and a selection unit which selects at least single data from the plurality of data stored at the storage unit according to similarity degree to the first compressed user preference data generated by the compression unit, wherein the plurality of data stored at the storage unit are respective data previously generated by utilizing the second set of parameters which is learned so that difference between the first compressed user preference data and second compressed user preference data generated from the second user preference data of a common user is to be small across a plurality of users.
12. An information processing apparatus, comprising: a first data acquisition unit which acquires first user preference data belonging to a first data space; a second data acquisition unit which acquires second user preference data belonging to a second data space which is different from the first data space; a first compression unit which generates first compressed user preference data having less data item number from the first user preference data by utilizing a first set of parameters and stores the first compressed user preference data at a recording medium; and a second compression unit which generates second compressed user preference data having the same data item number as that of the first compressed user preference data from the second user preference data by utilizing a second set of parameters and stores the second compressed user preference data at a recording medium, wherein the first set of parameters or the second set of parameters is a set of parameters which is learned so that difference between the first compressed user preference data and the second compressed user preference data of a common user is to be small across a plurality of users.
13. A data accumulation method, comprising the steps of: acquiring first user preference data belonging to a first data space; acquiring second user preference data belonging to a second data space which is different from the first data space; generating first compressed user preference data having less data item number from the first user preference data by utilizing a first set of parameters and storing the first compressed user preference data at a recording medium; and generating second compressed user preference data having the same data item number as that of the first compressed user preference data from the second user preference data by utilizing a second set of parameters and storing the second compressed user preference data at a recording medium, wherein the first set of parameters or the second set of parameters is a set of parameters which is learned so that difference between the first compressed user preference data and the second compressed user preference data of a common user is to be small across a plurality of users.
14. A program for causing a computer which controls an information processing apparatus to perform functions comprising: a first data acquisition unit which acquires first user preference data belonging to a first data space; a second data acquisition unit which acquires second user preference data belonging to a second data space which is different from the first data space; a first compression unit which generates first compressed user preference data having less data item number from the first user preference data by utilizing a first set of parameters and stores the first compressed user preference data at a recording medium; and a second compression unit which generates second compressed user preference data having the same data item number as that of the first compressed user preference data from the second user preference data by utilizing a second set of parameters and stores the second compressed user preference data at a recording medium, wherein the first set of parameters or the second set of parameters is a set of parameters which is learned so that difference between the first compressed user preference data and the second compressed user preference data of a common user is to be small across a plurality of users.
15. An information processing apparatus, comprising: a storage unit which stores a first set of parameters to generate first compressed user preference data having less data item number from first user preference data belonging to a first data space and a second set of parameters to generate second compressed user preference data having the same data item number as that of the first compressed user preference data from second user preference data belonging to a second data space which is different from the first data space; and a conversion unit which converts the first user preference data into the second user preference data based on the first set of parameters and the second set of parameters which are stored at the storage unit, wherein the first set of parameters or the second set of parameters is a set of parameters which is learned so that difference between the first compressed user preference data and the second compressed user preference data of a common user is to be small across a plurality of users.
16. The information processing apparatus according to claim 15, wherein the conversion unit converts the first user preference data into the second user preference data in accordance with correspondence between data items of the first user preference data and data items of the second user preference data, the correspondence being determined according to similarity degree of parameter values of each data item between the first set of parameters and the second set of parameters.
17. The information processing apparatus according to claim 15, further comprising: a compression unit which generates the first compressed user preference data from the first user preference data by utilizing the first set of parameters, wherein the conversion unit determines likely second user preference data capable of generating the first compressed user preference data generated by the compression unit by utilizing the second set of parameters as the second user preference data converted from the first user preference data.
18. A data conversion method, comprising: a step of converting first user preference data into second user preference data based on a first set of parameters to generate first compressed user preference data having less data item number from the first user preference data belonging to a first data space and a second set of parameters to generate second compressed user preference data having the same data item number as that of the first compressed user preference data from the second user preference data belonging to a second data space which is different from the first data space, wherein the first set of parameters or the second set of parameters is a set of parameters which is learned so that difference between the first compressed user preference data and the second compressed user preference data of a common user is to be small across a plurality of users.
19. A program for causing a computer which controls an information processing apparatus to perform functions comprising: a storage unit which stores a first set of parameters to generate first compressed user preference data having less data item number from first user preference data belonging to a first data space and a second set of parameters to generate second compressed user preference data having the same data item number as that of the first compressed user preference data from second user preference data belonging to a second data space which is different from the first data space; and a conversion unit which converts the first user preference data into the second user preference data based on the first set of parameters and the second set of parameters which are stored at the storage unit, wherein the first set of parameters or the second set of parameters is a set of parameters which is learned so that difference between the first compressed user preference data and the second compressed user preference data of a common user is to be small across a plurality of users.